314 research outputs found

    Mortality Prediction Models with Clinical Notes Using Sparse Attention at the Word and Sentence Levels

    Intensive care in-hospital mortality prediction has various clinical applications. Neural prediction models, especially when capitalising on clinical notes, have been put forward as an improvement on existing models. However, to be acceptable these models should be both performant and transparent. This work studies different attention mechanisms for clinical neural prediction models in terms of their discrimination and calibration. Specifically, we investigate sparse attention as an alternative to dense attention weights in the task of in-hospital mortality prediction from clinical notes. We evaluate the attention mechanisms based on: i) local self-attention over words in a sentence, and ii) global self-attention with a transformer architecture across sentences. We demonstrate that, on a publicly available dataset, the sparse mechanism outperforms the dense one for local self-attention in terms of predictive performance, and puts higher attention on prespecified relevant directive words. The performance at the sentence level, however, deteriorates, as sentences including the influential directive words tend to be dropped altogether.
    Comment: Technical Reports at the Department of Medical Informatics, Amsterdam UMC, 2021. https://kik.amc.nl/KIK/reports/TR2021-01.pd
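
    The difference between dense and sparse attention weights described in this abstract can be illustrated with a small sketch. The abstract does not name the sparse mechanism that was used; sparsemax is assumed here purely as one well-known option, and the scores are made-up numbers, not data from the report.

```python
import math


def softmax(scores):
    """Dense weights: every input gets a strictly positive weight."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Sparse weights (assumed mechanism): Euclidean projection of the
    scores onto the probability simplex, which can give exact zeros."""
    z = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, zk in enumerate(z, start=1):
        cumsum += zk
        # k stays in the support while 1 + k * z_(k) > sum of the top-k scores
        if 1 + k * zk > cumsum:
            tau = (cumsum - 1) / k
    return [max(s - tau, 0.0) for s in scores]


# Hypothetical attention scores for three words in a sentence.
word_scores = [1.0, 0.8, 0.1]
dense_weights = softmax(word_scores)
sparse_weights = sparsemax(word_scores)
```

    With these scores the dense weights are all positive, while the sparse weights assign exactly zero to the lowest-scoring word. That is the behaviour that makes sparse attention easier to inspect, and also the one that can drop an input entirely.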

    Unmasking the Chameleons: A Benchmark for Out-of-Distribution Detection in Medical Tabular Data

    Despite their success, Machine Learning (ML) models do not generalize effectively to data not originating from the training distribution. To reliably employ ML models in real-world healthcare systems and avoid inaccurate predictions on out-of-distribution (OOD) data, it is crucial to detect OOD samples. Numerous OOD detection approaches have been suggested in other fields, especially in computer vision, but it remains unclear whether the challenge is resolved when dealing with medical tabular data. To answer this pressing need, we propose an extensive reproducible benchmark to compare different methods across a suite of tests including both near and far OODs. Our benchmark leverages the latest versions of eICU and MIMIC-IV, two public datasets encompassing tens of thousands of ICU patients in several hospitals. We consider a wide array of density-based methods and SOTA post-hoc detectors across diverse predictive architectures, including MLP, ResNet, and Transformer. Our findings show that i) the problem appears to be solved for far-OODs, but remains open for near-OODs; ii) post-hoc methods alone perform poorly, but improve substantially when coupled with distance-based mechanisms; iii) the transformer architecture is far less overconfident than MLP and ResNet.

    Evaluation of SOFA-based models for predicting mortality in the ICU: A systematic review

    Introduction: To systematically review studies evaluating the performance of Sequential Organ Failure Assessment (SOFA)-based models for predicting mortality in patients in the intensive care unit (ICU). Methods: Medline, EMBASE and other databases were searched for English-language articles with the major objective of evaluating the prognostic performance of SOFA-based models in predicting mortality in surgical and/or medical ICU admissions. The quality of each study was assessed based on a quality framework for prognostic models. Results: Eighteen articles met all inclusion criteria. The studies differed widely in the SOFA derivatives used and in their methods of evaluation. Ten studies reported on developing a probabilistic prognostic model, only five of which used an independent validation data set. The other studies used the SOFA-based score directly to discriminate between survivors and non-survivors without fitting a probabilistic model. In five of the six studies, admission-based models (Acute Physiology and Chronic Health Evaluation (APACHE) II/III) were reported to have a slightly better discrimination ability than SOFA-based models at admission (the area under the receiver operating characteristic curve (AUC) of SOFA-based models ranged between 0.61 and 0.88), and in one study a SOFA model had higher AUC than the Simplified Acute Physiology Score (SAPS) II model. Four of these studies used the Hosmer-Lemeshow tests for calibration, none of which reported a lack of fit for the SOFA models. Models based on sequential SOFA scores were described in 11 studies including maximum SOFA scores and maximum sum of individual components of the SOFA score (AUC range: 0.69 to 0.92) and delta SOFA (AUC range: 0.51 to 0.83). Studies comparing SOFA with other organ failure scores did not consistently show superiority of one scoring system to another. Four studies combined SOFA-based derivatives with admission severity of illness scores, all of which reported improved predictions for the combination. Quality of studies ranged from 11.5 to 19.5 points on a 20-point scale. Conclusions: Models based on SOFA scores at admission had only slightly worse performance than APACHE II/III and were competitive with SAPS II models in predicting mortality in patients in the general medical and/or surgical ICU. Models with sequential SOFA scores seem to have a comparable performance with other organ failure scores. The combination of sequential SOFA derivatives with APACHE II/III and SAPS II models clearly improved prognostic performance of either model alone. Due to the heterogeneity of the studies, it is impossible to draw general conclusions on the optimal mathematical model and optimal derivatives of SOFA scores. Future studies should use a standard evaluation methodology with a standard set of outcome measures covering discrimination, calibration and accuracy.
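
    Several of the reviewed studies use a score directly to discriminate between survivors and non-survivors and report the AUC. As a minimal sketch of what that number measures, the AUC can be computed as the fraction of (non-survivor, survivor) pairs in which the non-survivor has the higher score, with ties counted as half. The function name and the patient data below are hypothetical and only serve as an illustration.

```python
def auc(scores, died):
    """AUC as the probability that a randomly chosen non-survivor has a
    higher score than a randomly chosen survivor (ties count as 0.5)."""
    pos = [s for s, d in zip(scores, died) if d == 1]  # non-survivors
    neg = [s for s, d in zip(scores, died) if d == 0]  # survivors
    if not pos or not neg:
        raise ValueError("need at least one survivor and one non-survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical admission scores and outcomes (1 = died in hospital).
example_scores = [2, 4, 5, 7, 9, 11]
example_died = [0, 0, 1, 0, 1, 1]
example_auc = auc(example_scores, example_died)
```

    An AUC of 0.5 corresponds to a score that ranks patients no better than chance, and 1.0 to a score that ranks every non-survivor above every survivor. The pairwise loop is quadratic in the number of patients, which is fine for illustration but not for large cohorts.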

    Using Non-Primitive Concept Definitions for Improving DL-Based Knowledge Bases

    Medical Terminological Knowledge Bases contain a large number of primitive concept definitions. This is due to the large number of natural kinds that are represented, and due to the limits of expressiveness of the Description Logic (DL) used. The utility of classification is reduced by these primitive definitions, hindering the knowledge modeling process. To better exploit the classification utility, we devise a method in which definitions are assumed to be non-primitive in the modeling process. This method aims at the detection of duplicate concept definitions, underspecification, and actual limits of a DL-based representation. This provides the following advantages: duplicate definitions can be found, the limits of expressiveness of the logic can be made clearer, and tacit knowledge is identified which can be expressed by defining additional concept properties. Two case studies demonstrate the feasibility of this approach.

    Soft-prompt tuning to predict lung cancer using primary care free-text Dutch medical notes

    We investigate different natural language processing (NLP) approaches based on contextualised word representations for the problem of early prediction of lung cancer using free-text patient medical notes of Dutch primary care physicians. Because lung cancer has a low prevalence in primary care, we also address the problem of classification under highly imbalanced classes. Specifically, we use large Transformer-based pretrained language models (PLMs) and investigate: 1) how soft prompt-tuning, an NLP technique used to adapt PLMs using small amounts of training data, compares to standard model fine-tuning; 2) whether simpler static word embedding models (WEMs) can be more robust compared to PLMs in highly imbalanced settings; and 3) how models fare when trained on notes from a small number of patients. We find that 1) soft-prompt tuning is an efficient alternative to standard model fine-tuning; 2) PLMs show better discrimination but worse calibration compared to simpler static word embedding models as the classification problem becomes more imbalanced; and 3) results when training models on a small number of patients are mixed and show no clear differences between PLMs and WEMs. All our code is available open source at https://bitbucket.org/aumc-kik/prompt_tuning_cancer_prediction/.
    Comment: A short version of this paper has been published at the 21st International Conference on Artificial Intelligence in Medicine (AIME 2023)

    Association between dementia parental family history and mid-life modifiable risk factors for dementia:a cross-sectional study using propensity score matching within the Lifelines cohort

    OBJECTIVE: Individuals with a parental family history (PFH) of dementia have an increased risk of developing dementia, regardless of genetic risks. The aim of this study is to investigate the association between a PFH of dementia and currently known modifiable risk factors for dementia among middle-aged individuals using propensity score matching (PSM). DESIGN: A cross-sectional study. SETTING AND PARTICIPANTS: A subsample of Lifelines (35–65 years), a prospective population-based cohort study in the Netherlands, was used. OUTCOME MEASURES: Fourteen modifiable risk factors for dementia and the overall Lifestyle for Brain Health (LIBRA) score, indicating someone’s potential for dementia risk reduction (DRR). RESULTS: The study population included 89 869 participants, of whom 10 940 (12.2%) had a PFH of dementia (mean (SD) age=52.95 (7.2)) and 36 389 (40.5%) had no PFH of dementia (mean (SD) age=43.19 (5.5)). For 42 540 participants (47.3%), PFH of dementia was imputed. After PSM, potential confounding variables were balanced between individuals with and without PFH of dementia. Individuals with a PFH of dementia more often had hypertension (OR=1.19; 95% CI 1.14 to 1.24), high cholesterol (OR=1.24; 95% CI 1.18 to 1.30), diabetes (OR=1.26; 95% CI 1.11 to 1.42), cardiovascular diseases (OR=1.49; 95% CI 1.18 to 1.88), depression (OR=1.23; 95% CI 1.08 to 1.41), obesity (OR=1.14; 95% CI 1.08 to 1.20) and overweight (OR=1.10; 95% CI 1.05 to 1.17), and were more often current smokers (OR=1.20; 95% CI 1.14 to 1.27) and ex-smokers (OR=1.21; 95% CI 1.16 to 1.27). However, they were less often low/moderate alcohol consumers (OR=0.87; 95% CI 0.83 to 0.91), excessive alcohol consumers (OR=0.93; 95% CI 0.89 to 0.98), socially inactive (OR=0.84; 95% CI 0.78 to 0.90) and physically inactive (OR=0.93; 95% CI 0.91 to 0.97). Having a PFH of dementia was associated with a higher LIBRA score (RC=0.15; 95% CI 0.11 to 0.19). CONCLUSION: We found that having a PFH of dementia was associated with several modifiable risk factors. This suggests that middle-aged individuals with a PFH of dementia are a group at risk and could benefit from DRR. Further research should explore their knowledge, beliefs and attitudes towards DRR, and whether they are willing to assess their risk and change their lifestyle to reduce dementia risk.

    Identification of high-risk subgroups in very elderly intensive care unit patients

    INTRODUCTION: Current prognostic models for intensive care unit (ICU) patients have not been specifically developed or validated in the very elderly. The aim of this study was to develop a prognostic model for ICU patients 80 years old or older to predict in-hospital mortality by means of data obtained within 24 hours after ICU admission. Aside from having good overall performance, the model was designed to reliably and specifically identify subgroups at very high risk of dying. METHODS: A total of 6,867 consecutive patients 80 years old or older from 21 Dutch ICUs were studied. Data necessary to calculate the Glasgow Coma Scale, Acute Physiology and Chronic Health Evaluation II, Simplified Acute Physiology Score II (SAPS II), Mortality Probability Models II scores, and ICU and hospital survival were recorded. Data were randomly divided into a developmental (n = 4,587) and a validation (n = 2,289) set. By means of recursive partitioning analysis, a classification tree predicting in-hospital mortality was developed. This model was compared with the original SAPS II model and with the SAPS II model after recalibration for very elderly ICU patients in the Netherlands. RESULTS: Overall performance measured by the area under the receiver operating characteristic curve and by the Brier score was similar for the classification tree, the original SAPS II model, and the recalibrated SAPS II model. The tree identified the most patients with a very high risk of mortality (9.2% of patients versus 8.9% for the original SAPS II and 5.9% for the recalibrated SAPS II had a risk of more than 80%). With a cut-point at a risk of 80%, the positive predictive values were 0.88 for the tree, 0.83 for the original SAPS II, and 0.87 for the recalibrated SAPS II. CONCLUSION: Prognostic models with good overall performance may also reliably identify subgroups of very elderly ICU patients who have a very high risk of dying before hospital discharge. The classification tree has the advantage of identifying the separate factors contributing to bad outcome and of using few variables. Up to 9.5% of patients were found to have a risk of dying of more than 85
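
    The two kinds of measure reported in this abstract, overall performance (Brier score) and the positive predictive value among patients above a risk cut-point, can be sketched as follows. The function names and the four example patients are hypothetical. The 0.8 cut-point mirrors the 80% risk threshold mentioned in the abstract, but none of the numbers come from the study.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and the observed
    outcome (1 = died, 0 = survived); lower is better."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def ppv_above(probs, outcomes, cut=0.8):
    """Positive predictive value among patients whose predicted risk
    exceeds the cut-point: the fraction of them who actually died."""
    flagged = [y for p, y in zip(probs, outcomes) if p > cut]
    if not flagged:
        return None  # nobody exceeded the cut-point, so PPV is undefined
    return sum(flagged) / len(flagged)


# Hypothetical predicted mortality risks and observed outcomes.
example_probs = [0.9, 0.85, 0.3, 0.1]
example_outcomes = [1, 0, 0, 0]
example_brier = brier_score(example_probs, example_outcomes)
example_ppv = ppv_above(example_probs, example_outcomes, cut=0.8)
```

    Two models can have similar Brier scores yet differ in how many patients they place above the cut-point and how often those patients die, which is why the abstract reports both.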

    Factors that predict outcome of intensive care treatment in very elderly patients: a review

    INTRODUCTION: Advanced age is thought to be associated with increased mortality in critically ill patients. This report reviews available data on factors that determine outcome, on the value of prognostic models, and on preferences regarding life-sustaining treatments in (very) elderly intensive care unit (ICU) patients. METHODS: We searched the Medline database (January 1966 to January 2005) for English language articles. Selected articles were cross-checked for other relevant publications. RESULTS: Mortality rates are higher in elderly ICU patients than in younger patients. However, it is not age per se but associated factors, such as severity of illness and premorbid functional status, that appear to be responsible for the poorer prognosis. Patients' preferences regarding life-sustaining treatments are strongly influenced by the likelihood of a beneficial outcome. Commonly used prognostic models have not been calibrated for use in the very elderly. Furthermore, they do not address long-term survival and functional outcome. CONCLUSION: We advocate the development of new prognostic models, validated in elderly ICU patients, that predict not only survival but also functional and cognitive status after discharge. Such a model may support informed decision making with respect to patients' preferences.

    Tight glycemic control and computerized decision-support systems: a systematic review

    Objective: To identify and summarize characteristics of computerized decision-support systems (CDSS) for tight glycemic control (TGC) and to review their effects on the quality of the TGC process in critically ill patients. Methods: We searched Medline (1950-2008) and included studies on critically ill adult patients that reported original data from a clinical trial or observational study with a main objective of evaluating a given TGC protocol with a CDSS. Results: Seventeen articles met the inclusion criteria. Eleven out of seventeen studies evaluated the effect of a new TGC protocol that was introduced simultaneously with a CDSS implementation. Most of the reported CDSSs were stand-alone, were not integrated in any other clinical information systems and used the "passive" mode requiring the clinician to ask for advice. Different implementation sites, target users, and time of advice were used, depending on local circumstances. All controlled studies reported on at least one quality indicator of the blood glucose regulatory process that was improved by introducing the CDSS. Nine out of ten controlled studies either did not report on the number of hypoglycemia events (one study), or reported no change (six studies) or even a reduction in this number (two studies). Conclusions: While most studies evaluating the effect of CDSS on the quality of the TGC process found improvement when evaluated on the basis of the quality indicators used, it is impossible to define the exact success factors, because of simultaneous implementation of the CDSS with a new or modified TGC protocol and the hybrid solutions used to integrate the CDSS into the clinical workflow.

    The impact of a computerized decision aid on empowering pregnant women for choosing vaginal versus cesarean section delivery: study protocol for a randomized controlled trial

    Cesarean delivery on maternal request (CDMR) is one of the main reasons for cesarean delivery in Iran, and women often need help in making a decision about the delivery options available to them. The main objective of this study is to evaluate the effect of a computerized decision aid (CDA) system on empowering pregnant women in choosing an appropriate mode of delivery. This CDA contrasts the advantages and disadvantages of vaginal versus cesarean section delivery in terms of their value to the individual woman. The protocol concerns a randomized trial that will be performed among Iranian women. Four hundred pregnant women will be recruited from two private and two public prenatal centers in Mashhad, Iran. They will be randomly assigned to either an intervention or a control group. The designed CDA will be provided to the intervention group, whereas the control group will only receive routine care. The CDA provides educational content as well as some recommendations. The CDA's knowledge base is obtained from the results of studies on predictors of cesarean delivery. The CDA's software will be installed on women's computers for use at home. The two primary outcomes for the study are O'Connor's Decisional Conflict Scale and knowledge as measured by true/false questions. Actual mode of delivery (vaginal versus cesarean) will be compared in the two groups. We investigate the effect of a CDA on empowering pregnant women in terms of reducing their decisional conflict as well as improving their clinical knowledge pertaining to mode of delivery. This trial is registered with the Iran Trial Registrar under registration number IRCT2015093010777N4 and registration date 26 October 201